Department of Statistics and Probability COLLOQUIUM
Abstract
Distance weighted discrimination (DWD) is a margin-based classifier with an interesting geometric motivation. DWD was originally proposed as a superior alternative to the support vector machine (SVM); however, DWD has yet to become as popular as the SVM. The main reasons are twofold. First, the state-of-the-art algorithm for solving DWD is based on second-order cone programming (SOCP), while the SVM is a quadratic programming problem, which is much more efficient to solve. Second, the current statistical theory of DWD mainly focuses on linear DWD in the high-dimension, low-sample-size setting and on data piling, while the learning theory for the SVM mainly focuses on the Bayes risk consistency of the kernel SVM. In fact, the Bayes risk consistency of DWD is presented as an open problem in the original DWD paper. In this work, we advance the current understanding of DWD from both computational and theoretical perspectives. We propose a novel, efficient algorithm for solving DWD, and our algorithm can be several hundred times faster than the existing state-of-the-art algorithm based on SOCP. In addition, our algorithm can handle the generalized DWD, while the SOCP algorithm works well only for a special case of DWD and not for the generalized DWD. Furthermore, we consider a natural kernel DWD in a reproducing kernel Hilbert space and establish the Bayes risk consistency of the kernel DWD. We compare DWD and the SVM on several benchmark data sets and show that the two have comparable classification accuracy, but DWD equipped with our new algorithm can be much faster to compute than the SVM.
To request an interpreter or other accommodations for people with disabilities, please call the Department of Statistics and Probability at 517-355-9589.
Similar resources
FUZZY INFORMATION AND STOCHASTICS
In applications there occur different forms of uncertainty. The two most important types are randomness (stochastic variability) and imprecision (fuzziness). In modelling, the dominating concept to describe uncertainty is using stochastic models which are based on probability. However, fuzziness is not stochastic in nature and therefore it is not considered in probabilistic models. Since many years t...
Testing a Point Null Hypothesis against One-Sided for Non Regular and Exponential Families: The Reconcilability Condition to P-values and Posterior Probability
In this paper, the reconcilability between the P-value and the posterior probability in testing a point null hypothesis against a one-sided hypothesis is considered. Two essential families of distributions, the non-regular family and the exponential family, are studied. It is shown that, for a non-regular family of distributions, it is possible in some cases to find a prior distribution function under which P-va...
COLLOQUIUM Department of Statistics and Probability Michigan State University
The basic message of this talk could have been delivered long ago, maybe even soon after the publication of the classical papers of K. Pearson (1900) and R. A. Fisher (1922, 1924). However, the tradition of using the chi-square goodness of fit statistic became so widespread, and the point of view that, for the case of discrete distributions, statistics “have to” have their asymptotic ...
Probability-possibility DEA model with Fuzzy random data in presence of skew-Normal distribution
Data envelopment analysis (DEA) is a mathematical method to evaluate the performance of decision-making units (DMUs). In the performance evaluation of an organization based on the classical theory of DEA, input and output data are assumed to be deterministic, while in the real world, the observed values of the input and output data are mainly fuzzy and random. A normal distribution is a contin...
Probability Distribution Fitting to Maternal Mortality in Nigeria.
The consequences of Maternal Mortality (MM) cannot be overemphasized. It inhibits population growth, resulting in loss of lives, among others. This work aims to obtain the maternal mortality rates (MMR) in Nigeria, identify some fitted distributions to MMR and determine which of the distributions best fits the data. A comprehensive Exploratory Data Analysis (EDA) was carried out on MM and the MMRs...
A continuous approximation fitting to the discrete distributions using ODE
Fitting probability density functions to discrete probability functions has always been needed and is very important. This paper fits continuous curves, which are probability density functions, to the binomial, negative binomial, geometric, Poisson and hypergeometric probability functions. The main key in these fittings is the use of the derivative concept and common differential ...